For this first exercise you will examine how various software packages can be used for geospatial analysis and graphical display of spatial data. Because of the variety of operating systems in use today, each of the exercises in this course can be completed in ArcGIS Pro, QGIS, or R. Each of these has its pros and cons, but for simplicity here is a table that might suggest which software would be best for you:

| ArcGIS Pro | QGIS | R |
| --- | --- | --- |
| Available on Windows PC only | Available for PC, Mac, or Linux (limited) | Available for PC, Mac, Linux, ChromeOS, iOS, or Android via a modern web browser |
| Available by subscription only | Open Source | Open Source |
| Graphical User Interface with built-in functionality for Python; add-ons available for statistical languages | Graphical User Interface with built-in functionality for Python; use of plug-ins for other languages and analyses | Scripting language, mostly used for statistical purposes; packages expand its use to other applications such as geospatial analyses |
| Widely used in academia, government, and industry jobs | Widely used in academia and industry | Predominantly used in academia, research, and specialized industry jobs |
| Help available at ESRI.com and through various outlets via a Google search | Help available at QGIS.org and through sites like stackoverflow.com or other sites via a Google search | Help available through individual package documentation and sites like stackoverflow.com or other sites via a Google search |

For the purposes of this course, you can complete the laboratory exercises in any of these software packages. Every lab has drop-down menus color-coded for the three packages. You will have access to all of this material during and after the course, and while the steps might differ between versions, the outcomes will be the same. So if you choose to complete the course with one package, you can always go back and complete the work in another in your own time.

While all of the software will be available on the computers in McCord 210, if you choose to use ArcGIS Pro you will need to contact the APSU GIS Center with your student information to obtain a one-year license for your personal computer. QGIS is available for download using this link. If you are completing the course in R you will not need to install any software; however, you will need a Google account and access to the Chrome browser. When using R you will complete all of the lab exercises in Google Colaboratory and will access everything you need from the exercise pages on GitHub.

More information for each software and the steps to complete this exercise can be found below. Be sure to follow the color-coded drop-down for the specific software you are using for the lab.

View information for ArcGIS Pro

ArcGIS Pro is a proprietary software from ESRI. Student access is available in McCord 210 and several other computer labs on campus. If you wish to obtain the software for use on your personal computer (Windows-based PC only) you will need to contact the APSU GIS Center at 601 N 2nd St, Clarksville, TN 37040, (931) 221-7500. Please have your APSU student ID and A# available for verification prior to obtaining a license. Please contact the GIS Center if you have any issues downloading or installing the software.


View information for QGIS

If you will be using QGIS for the exercises you will need to have the software installed on your computer. QGIS is available for PCs, Mac, and Linux computers. You can find the appropriate download for your operating system here:

When installing on macOS for the first time you may need to go to System Preferences > Security and Privacy and, on the General tab, set “Allow apps downloaded from” to App Store and identified developers, then click Open Anyway. Please contact your instructor if you have any issues downloading or installing the software.


View information for R

You will be completing the R portion of each exercise using the Google Colaboratory Executable Environment. Colaboratory, or “colab” for short, is based on the popular Jupyter Notebooks and allows you to write and execute R or Python in your browser, with:

  • Zero configuration required
  • Free access to GPUs
  • Easy sharing

Each colab notebook is separated into code cells and text cells. A code cell is used to write and execute a script interactively. Each text cell uses simple markdown syntax for creating plain text information. You can easily share the notebook like other Google Drive based documents by clicking the “Share” link in the upper right-hand portion of the window.

Before beginning each exercise using R, you will need to open the Exercise Colab Notebook and alter the URL to view/edit the file in colab. To do this, navigate to the exercise’s GitHub page; in this case https://github.com/chrismgentry/GIS1-Exercise-2. In the list of folders and files you will find a file that will always be identified as GIS1_EX followed by the exercise number and a .ipynb file extension. So for this exercise it will be GIS1_EX2.ipynb.

Finding the Google Colab file on GitHub


When you locate the file, click on the name to open the .ipynb notebook. In the resulting window, click on the URL and insert tocolab immediately after github in the address. The new URL for this exercise should now read:
https://githubtocolab.com/chrismgentry/GIS1-Exercise-2/blob/main/GIS1_EX2.ipynb
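
The same substitution can be sketched in R, which may help if you ever want to build the colab address for several notebooks at once. This is only an illustration: the variable names are made up, and the GitHub address is assumed to be the exercise page given above followed by the notebook path.

```r
# GitHub address of the notebook (assumed: exercise page + /blob/main/ + file name)
github_url <- "https://github.com/chrismgentry/GIS1-Exercise-2/blob/main/GIS1_EX2.ipynb"

# Insert "tocolab" immediately after "github" in the host name
colab_url <- sub("github.com", "githubtocolab.com", github_url, fixed = TRUE)

print(colab_url)
```

Using fixed = TRUE treats the search text literally, so the dot in github.com is not interpreted as a regular-expression wildcard.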

Converting ipy file to colab


Press Enter/Return and the file should now open inside of Colab. Your screen might have a slightly different appearance, but you should see the Exercise 2, GIS 1 header indicating you are in the correct notebook.

Converting ipy file to colab


Each time you open one of these colab notebooks, or if you have not interacted with the notebook for an extended period of time, you will need to make sure the environment is connected and all of the sample script has been run. To do this, go to Runtime > Run all (or press Ctrl+F9) and run the notebook. This may take a moment to complete, so be patient until the last code cell has been executed. You should see a green check mark with an allocation of RAM and Disk, shown by horizontal bars, which indicates you are connected. In each section of code there is a Run cell button that will allow you to run the individual code cell. This button will show a rotating loading symbol while the code cell is being executed. Once it is complete the button will return to the run state, and there may be an output visible depending on the script. You can add your own text or code cells to a notebook by moving your cursor slowly over the notebook to reveal the code/text option bar or by going to Insert > Code cell/Text cell.

Adding Text or Code Blocks


By clicking within each code or text cell you can edit the contents and adjust the properties using the various preset option controls.

Edit block properties


You should practice adding and removing cells, editing their content, and rearranging the order of the cells. This will help familiarize you with notebook tools and allow you to manipulate notebooks in these exercises and create your own notebooks in the future.

For each new exercise there will be a colab notebook available. Within the notebook will be sample code for you to run, code cells for you to complete as part of the exercise, and questions to be answered within labeled text cells. Detailed directions will be available on the individual exercise webpage. To receive a grade, complete the code and question sections and share the notebook link.

1 The Introduction

The Tennessee Valley Authority (TVA) and Tennessee Department of Environment and Conservation (TDEC) are considering a partnership to increase the number of electric vehicle charging stations across the state. With destinations such as Nashville, Memphis, Gatlinburg and the Great Smoky Mountains, Bristol Motor Speedway, Signal Mountain, and the Tennessee Aquarium, tourism is a major part of the state’s economy. So accommodating residents and visitors with electric vehicles, as well as creating a network for those traveling the state’s major interstates (I-65, I-40, I-24, I-75, etc.), is of utmost importance.

TDEC says this network of charging stations will also promote electric vehicle growth by giving drivers confidence they will have easy access to refueling while away from home, eliminating so-called “range anxiety” that keeps many consumers from considering electric vehicles as viable transportation. TVA says electric vehicle adoption will spur jobs and economic investment in the region, keep refueling dollars in the local economy, reduce the region’s largest source of carbon emissions, and save drivers and fleets money.

So TVA and TDEC have asked you to develop a map showing the number of public electric vehicle charging stations in the continental United States. This will help them understand what the nation’s current infrastructure looks like and how Tennessee currently compares to neighboring states. It will also help TVA and TDEC determine approximately how many additional stations they should add.

In this exercise you will:

  • Navigate software used for geospatial analysis
  • Examine attributes of various datasets
  • Learn how to graphically display spatial data
  • Create a cartographically sound product

Software-specific directions can be found for each step below. Please submit your answers to the questions and your final map by the due date.

1.1 Step One: The Data

Insert Text Here

View Directions in ArcGIS Pro

Insert Text Here

Question No. 1
Insert Text Here

View Directions in QGIS

The purpose of this exercise is to help familiarize you with the QGIS program. With this software, you will have the ability to view, create, and edit geographic data, query spatial data, examine spatial relationships, and create publishable maps. In these directions I will post images from both PC (on the left) and macOS (on the right) for each step. While your set-up may differ slightly, the configurations and layout of the tools should be similar.

When you begin a new exercise it would be beneficial to create a new folder specifically for the data and project files associated with the lab. This will help you to organize your data and be able to quickly refer back to it in later exercises. For this exercise you will need to download the Electric Vehicle Charging Station data from the GitHub Repository by clicking on the download button at this link and saving it in your exercise folder. Once the download is complete you will need to unzip the file by using Control+click on macOS or right-click > extract all on PC.

Data Download from GitHub


To start the QGIS program go to Start Menu > QGIS or launch the application from the Mac Launchpad. It might be beneficial to create a shortcut on the desktop or add it to your dock for quicker access in the future. The opening screen that appears as the program loads will be extremely useful to you while working on future projects; for this exercise, however, select New Empty Project.

QGIS Start Screen


The screen depicted below is separated into two sections. On the left are the Browser and Layers sections, and on the right is the Map Canvas. Your version may vary slightly from the images. There are a number of ways to add data, but in this example we are going to use the Browser window to connect to our ESRI Data folder.

QGIS Start Screen


In the Browser window on the left, navigate to the folder where you saved and unzipped the exercise data folder. Once you locate a folder in the browser, you can use control+click (macOS) or right-click (PC) to Add as a favorite. This will link it to the favorites drop-down in the browser window, giving you quicker future access. To add the data, navigate to the ev_data.shp file and then click-and-drag it to the map canvas. You will be able to add all types of data in this same way. If there is a pop-up window that mentions transformations, leave it on option 1 and click OK. While the look of the US in this dataset is less than ideal, the purpose of this exercise is to ensure you know how to display data and use the basic tools for navigating the software. Transformations will be discussed further in a couple of weeks after a lecture on projections and coordinate systems.

QGIS Start Screen


Your screen should now look similar to the screens below (colors may vary). Notice that the longitude and latitude values at the bottom of the screen adjust as you move your cursor.

QGIS Map Canvas


You can now begin to explore some of the basic tools for navigating the map canvas such as:

  • Zoom in and out with the Fixed Zoom Tools
  • Pan tool to position the map on the screen
  • Map scale where you can adjust the map scale (level of zoom) manually
  • Zoom full button which will adjust the view to accommodate all datasets


    Use these tools to zoom in and out, reposition the map on the screen, and return to the current view using the zoom full button.

    Another useful tool is the Identify Features cursor, which uses an icon with a lowercase i in a blue circle. This option extracts information for a selected feature from the attribute table of the selected layer.

    Identify Cursor


    If you wanted to view the entire underlying dataset you could control-click (macOS) or right-click (PC) on the active layer and select Open attribute table. In a future exercise you will learn how to sort, query, and edit information within the attribute table. For now, examine the numerous variables available for each state. What is the column name and range of values for the electric vehicle charging stations?
    Hint: This information will be needed in the next section.

    Attribute Table


    Question No. 1
    Using the identify tool, what was the population of Tennessee in 2010? What percentage of the population is female?

    Using the attribute table, find the area (SQMI) of Colorado. How much larger is it than Wyoming?
    Type your answers in a Word document or record the answers on a sheet of paper. They will need to be submitted at the conclusion of this lab.

    View Directions in R

    For each exercise in R you will need to load various packages that are used to complete analyses and graphical output. Generally these packages will be preloaded in the colab notebook; however, in subsequent labs you may need to install certain packages to complete the exercises. To install a package in R you use the following function, where “x” is the name of a specific package.

    install.packages("x")

    Once the package has been installed, it will need to be loaded using a similar function:

    library("x")
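
If you are unsure whether a package is already installed, the two steps can be combined so that installation only happens when it is needed. The helper below is a minimal sketch, not part of the course notebook: load_or_install is a made-up name, and the stats package is used in the example only because it ships with every R installation.

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name stored in pkg
  library(pkg, character.only = TRUE)
}

# stats is always available, so nothing is downloaded here
load_or_install("stats")
```

The same call with a different name, for example load_or_install("maps"), would install the package first if the notebook did not already have it.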

    In the colab notebook for this exercise you will see where three packages, tidyverse, maps, and ggsn, were installed and loaded. Now that you have the packages required for the exercise you will need to add the data. For this lab the data consists of state names, abbreviations, and the number of electric vehicle recharging stations. To avoid the need to download the data, you will use the read.csv() function and a URL to import the data. Using the head() function will allow you to view the first few lines of any dataset.

    evs <- read.csv('https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/ev_stations.csv')
    head(evs)

    In the script above you will see the use of <-. This is an operator used to create an object that can be used in later steps. If the script had been written as read.csv('https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/ev_stations.csv'), the dataset would have been read and immediately displayed on the screen. However, it would not have been available for subsequent analyses. In colab, you can create your own code cell and test it out to see the results. Since you need this information for later, it is important to use <- to create an object out of the imported data. In this and future exercises, you will see that operator used frequently to create objects for analysis.
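
The difference can be tried without downloading anything. The sketch below reads a tiny, made-up table from a text string instead of a URL; the column names mirror the ones used in this exercise, but the values are invented for illustration.

```r
# Made-up data standing in for the downloaded file
csv_text <- "state,evs_count\ntennessee,500\nkentucky,200"

# Without <- the table is printed once and then discarded
read.csv(text = csv_text)

# With <- the table is stored as an object for later steps
evs_demo <- read.csv(text = csv_text)

# The stored object can now be inspected or reused
head(evs_demo)
nrow(evs_demo)
```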

    In order to create a map of electric vehicle charging stations per state you need to obtain information for the states and create a new object. The tidyverse package is a collection of individual packages including the Grammar of Graphics, or ggplot2, package. This package will frequently be used to display your data, but it can also retrieve geographic information for the US from the maps package. You can obtain that information by using the map_data function. In a new code block, you can use ?map_data to view help information on the function. Alternatively, you can view the documentation for any package or function by searching the package or function name and CRAN (Comprehensive R Archive Network). For this function you would search map_data cran, and the first link is likely to be the RDocumentation page for the function. Within this documentation you will find the arguments available for the function and example scripts. So to create an object that contains information for the continental US you can use:

    us <- map_data('state')

    Using the Grammar of Graphics package, ggplot2, you can create a visualization of this data. Here is that script:

    ggplot(us) + 
      geom_polygon(aes(x=long, y=lat, group=group), color = "white")

    In this script you identified that you wanted to use ggplot to visualize the us object you created in the previous step. Next, ggplot needs to know what sort of object to draw. This is done by using a geom_ function followed by a type of geometry. These include point, line, polygon, and other geometries such as histograms, boxplots, and violin plots for statistics, or contours, rasters, and tiles for three-dimensional data. So for this step you used geom_polygon. Next, ggplot needs to know how the data should be displayed. If you create a new code block and type str(us) you can see the structure of the data and a few of the variables. You will notice the dataset contains long (longitude), lat (latitude), group, order, region (the state names), and subregion. So in ggplot you provide a series of aesthetics using aes() to direct ggplot on how to display the data. In this case, x = long, y = lat, and you need to tell it to organize the points by the category group. If you leave out group for this specific script, ggplot will not know which points belong to which polygon and will connect them all as a single shape, so your map will not appear correctly. color = "white" tells ggplot to use white borders for the individual states. What do you think would happen if you change the word color to fill? In the resulting image, ggplot used the information from your aesthetics to draw the polygons and automate labels for the x and y axes. In a later step you will learn how to customize those labels.
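
The role of group is easier to see on a very small dataset. The sketch below is an illustration only: it assumes ggplot2 is available (it is loaded with tidyverse in the notebook) and uses two made-up squares rather than real state boundaries.

```r
library(ggplot2)

# Two separate unit squares (made-up coordinates), four corners each
squares <- data.frame(
  long  = c(0, 1, 1, 0,   3, 4, 4, 3),
  lat   = c(0, 0, 1, 1,   0, 0, 1, 1),
  group = c(1, 1, 1, 1,   2, 2, 2, 2)
)

# With group: each set of four corners is drawn as its own polygon
with_group <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat, group = group), color = "white")

# Without group: all eight corners are joined into one polygon,
# so extra area appears between the two squares
without_group <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat), color = "white")

print(with_group)
print(without_group)
```

Running both plots side by side shows why the us data needs its group column: each state, and each island or separate piece of a state, is its own polygon.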

    It may seem as if this is a difficult way to view the data. In other software with a graphical user interface (GUI) you would most likely click open, navigate to the folder containing the data, double-click on that dataset, and then it would appear on your screen. Essentially, every time you click “open” in a GUI, it is executing a specific set of scripts to 1.) open the navigation window, 2.) allow the selected file to be imported, and 3.) display the information on your screen. What you did above in three lines of code was to tell R to 1.) create an object called us and 2.) display it on the screen with some specific parameters. The benefit of completing this in R versus something with a GUI is that, if you had three more similar datasets to view, you would simply change us to another object and re-run the script. To repeat the process in a GUI you would need to repeat all of the steps from the beginning. This might not seem like much for three steps, but what if your visualization had twenty steps, as many do? In R you would still simply change the dataset and run the same script, but in other programs you would need to repeat the same twenty steps for each dataset. Additionally, if a colleague wanted to display some data in the same way, you could either copy and paste the code or type out each of the twenty steps with directions. Which of these seems to be more consistently repeatable? Once you begin to understand the syntax (the order or arrangement of words and phrases to form proper scripts) you will be more easily able to interpret sample scripts and fix errors in your own code.
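
One way to make that reuse explicit is to wrap the plotting script in a function. The sketch below is an illustration under stated assumptions: draw_outline is a made-up name, ggplot2 is assumed to be loaded, and the two small data frames are invented stand-ins for different map datasets.

```r
library(ggplot2)

# Hypothetical helper: draws any data frame with long, lat, and group columns
draw_outline <- function(map_df, border = "white") {
  ggplot(map_df) +
    geom_polygon(aes(x = long, y = lat, group = group), color = border)
}

# Two made-up datasets standing in for different map objects
square   <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1), group = 1)
triangle <- data.frame(long = c(0, 2, 1),    lat = c(0, 0, 2),    group = 1)

# The same steps are applied to each dataset by changing only the input
plot_square   <- draw_outline(square)
plot_triangle <- draw_outline(triangle)

print(plot_square)
print(plot_triangle)
```

With the us object from this exercise, draw_outline(us) would be expected to give the same map as the original two-line script.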

    Question No. 1
    You used ggplot(us) + geom_polygon(aes(x=long, y=lat, group=group), color = "white") to create the visualization in this step. What script would you use to make the same map but with black borders and blue states? Add a code cell below this one, type the script, and run it to view the output. Hint: color = “…..”, fill = “…..”

    1.2 Step Two: The Analysis

    In this step you will organize and display the data in order to prepare it for the final visualization.

    View Directions in ArcGIS Pro

    Insert Text Here


    View Directions in QGIS

    Now that you have the data displayed on the screen and understand how to access the underlying data, you need to customize the view so you can see the spatial distribution of electric vehicle charging stations in the US. To begin you will need to control-click (macOS) or right-click (PC) on the ev_data in your layers and click on properties.

    Right Click Properties


    In the resulting window you will need to go to the Symbology tab (1.) in the left-hand menu. In this window you can change the fill of the polygons and change their opacity (or level of transparency). You can also adjust the symbology of your dataset. Those options are available in a selection bar at the top of the window. For this dataset they are:

  • No Symbols
  • Single Symbol
  • Categorized
  • Graduated
  • Rule-based
  • Inverted Polygons
  • 2.5 D


    For this specific data you will choose Graduated (2.) since the data needs to be displayed by a range of numeric values. Next you will select the ev_station variable (3.) in the dataset. The drop-down for the Color ramp option offers a number of color options. For this example select Viridis (4.). Do you recall what the range of values is for the ev_station data? Because the largest value is greater than 30,000 and the smallest value is around 100, you will need to set the Mode to a logarithmic scale (5.) to properly display the data while avoiding a bias toward the larger values. Change the Classes to 6 (6.) and click OK (7.).

    Symbology Properties

    Because the macOS and PC versions are identical only one image is shown.


    Your screen should now look similar to this:

    Graduated Data

    At this point you should save your work. Whether using macOS or PC, on the menu bar go to Project > Save As… and save your project in the folder you created for this exercise.

    Question No. 2
    In this step you used a Graduated symbology to visualize the data and organized the values logarithmically. There were several other options within the Mode menu.

  • Equal Count (Quantile)
  • Equal Interval
  • Logarithmic Scale
  • Natural Breaks (Jenks)
  • Pretty Breaks
  • Standard Deviation
    Adjusting the mode value, describe how the visualization changes with each of these different options.
    Record your answers and submit at the conclusion of the lab.

    View Directions in R

    Now that you have datasets for electric vehicle charging stations (object = evs) and the continental US (object = us), you need to combine that data to allow the states to be color coded based on the number of charging stations per state. To do this you will use a function called merge from the base R functions, which will allow you to combine the information from the evs and us objects into a single dataset that contains information from both based on a common variable. So to start you will need to determine what variable(s) are contained within each dataset. You have seen how to examine datasets using both head() and str() already in this exercise. Create a new code block and examine the structure of each dataset. You will see that there is a column for state name in each, except they are labeled differently. This is important information you will need to properly merge the datasets.

    To do this you will first create an object (<-) with a new name, then with the merge function set the following arguments:

    • x, which is the first dataset
    • y, which is the second dataset
    • by.x, identifies the column to use for the merge in x
    • by.y, identifies the column to use for the merge in y
    • all = TRUE, which tells the function to retain all data

    So your final script will be:

    states <- merge(x = us, y = evs, by.x = "region", by.y = "state", all = TRUE)
    head(states)

    Now you will see the columns from the evs dataset (the station counts and state abbreviations) included alongside the us data. This new dataset will be what you use to visualize the information in the next step.
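
To see what each merge argument does, it can help to try the function on a few rows first. The data frames below are made-up stand-ins for us and evs (the names shapes and counts and all values are invented); only the column names region, state, order, and evs_count follow this exercise.

```r
# Made-up stand-ins for the us and evs objects (values are illustrative only)
shapes <- data.frame(
  region = c("tennessee", "tennessee", "kentucky", "alabama"),
  long   = c(-90, -82, -89, -88),
  lat    = c(35, 36, 37, 31),
  order  = c(3, 4, 2, 1)
)
counts <- data.frame(
  state     = c("tennessee", "kentucky", "georgia"),
  evs_count = c(500, 200, 900)
)

# No column name is shared, so by.x and by.y must be given explicitly
intersect(names(shapes), names(counts))

merged <- merge(x = shapes, y = counts, by.x = "region", by.y = "state", all = TRUE)

# all = TRUE keeps unmatched rows from both sides and fills the gaps with NA:
# alabama has no evs_count and georgia has no coordinates
print(merged)

# Optional: put the rows back in the drawing order stored in the order column
merged <- merged[order(merged$order), ]
```

The last line is a precaution rather than a required step: merge sorts its result by the key column, so if a merged map ever draws with stray lines, re-sorting by the order column is a common first thing to try.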

    Question No. 2
    Using summary(states) in a new code cell, what are the largest and smallest numbers of electric vehicle charging stations?

    1.3 Step Three: The Visualization

    You will learn how to create a graphical display of your data that includes cartographic elements such as legend, scale bar, north arrow, etc.

    View directions in ArcGIS Pro

    Insert Text Here


    View directions in QGIS

    Now it’s time to turn your data into a map. From the menu bar in either macOS or PC click Project > New Print Layout. This will open a new window where you will add the data, title, legend, north arrow, scale bar, and your name and date.

    New Print Layout


    View directions in R

    With this new dataset you are now ready to create a map to examine the distribution of electric vehicle charging stations across the country. In step one you used a very simple script to display the us data.

    ggplot(us) + 
      geom_polygon(aes(x=long, y=lat, group=group), color = "white")

    A similar script, ggplot(states) + geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white"), would allow you to quickly visualize the data. However, you need to add a number of elements in order to create a map, such as a scale bar, north arrow, title, etc. Additionally, you can customize other components to provide a better overall visualization.

    Earlier in the notebook we installed the ggsn package. This package allows you to add “north symbols and scale bars for maps created with ‘ggplot2’ or ‘ggmap’.” So you can build on the script above to create a map of the information in the states dataset. To begin, run the script above with the added fill argument to see the outcome:

    ggplot(states) + geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white")

    One thing you will notice is that the categories are very difficult to distinguish. Because you have wide-ranging data in the evs_count column, only the largest value is showing. A simple fix would be to take the common logarithm of the data, which reduces the skew towards large values. The scale_fill_viridis_c function scales the data and provides a color map designed to be perceived by viewers with common forms of color blindness. So you can add scale_fill_viridis_c(option = "D", trans = "log10") to the script above where:

    • scale_fill_viridis_c is a fill pattern for continuous data
    • option = “D” is the default color option
      • There are five color options available with this function
        • A = magma
        • B = inferno
        • C = plasma
        • D = viridis
        • E = cividis
    • trans = “log10” transforms the data using common logarithm, other options are available; see documentation

    The new script should now look like this:

    ggplot(states) + 
      geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
      scale_fill_viridis_c(option = "D", trans = "log10")
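
The effect of the log10 transformation can be checked with a few numbers. The sketch below uses three made-up counts of roughly the size seen in this dataset and shows where each one falls along a color ramp, from 0 (one end) to 1 (the other), before and after taking the common logarithm.

```r
# Made-up station counts spanning a wide range
counts <- c(100, 1000, 30000)

# Position along a linear color ramp: the middle value sits almost at the bottom
linear_pos <- (counts - min(counts)) / diff(range(counts))
print(round(linear_pos, 3))

# Position along a log10 color ramp: the middle value moves toward the center
logged  <- log10(counts)
log_pos <- (logged - min(logged)) / diff(range(logged))
print(round(log_pos, 3))
```

On the linear ramp the two smaller values are nearly indistinguishable, which is why the first map appeared to show only the largest value.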

    Now that you are able to visualize the separations in the data you can add additional information. You can start with customizing the labels, map title, and legend title. This can all be completed by adding a single line of code containing all of the text information for those items:

    ggplot(states) + 
      geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
      scale_fill_viridis_c(option = "D", trans = "log10") + 
      labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations")

    Feel free to edit the label names and the color option in the script to provide your own customizations. Next you need to add a scale bar and north arrow. To view the available options for the north arrow type northSymbols() into a new code block. The numeric values below each symbol will be used in the script to identify the specific style you choose. Because the north arrow, north, is specifically related to the map data you need to provide the following arguments:

    • dataset
    • symbol, identified by the numerical value from northSymbols()
    • location, indicating where to base the location on the map
    • anchor, coordinates for the symbol position on the map (based off the location)
    • scale, the symbol size as a proportion of the map size

    So your new script will look like:

    ggplot(states) + 
      geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
      scale_fill_viridis_c(option = "D", trans = "log10") + 
      labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
      north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25))

    In this example, location = "bottomleft" means the position of the north arrow is measured from the bottom left of the symbol, and anchor = c(x = -70, y = 25) is the geographic location on the map at which to draw the symbol. For example, if the anchors were set at -100 and 40 the symbol would be drawn on the Nebraska/Kansas border. Feel free to adjust the anchor points to draw the north arrow in your preferred location.
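
Rather than guessing anchor coordinates, you can calculate them from the extent of the data. The helper below is a sketch, not part of ggsn: place_anchor is a made-up name, and the longitude and latitude limits are rough assumed bounds for the continental US used only for the demonstration.

```r
# Hypothetical helper: returns an anchor a given fraction of the way across
# (frac_x) and up (frac_y) the extent of the coordinates supplied
place_anchor <- function(long, lat, frac_x, frac_y) {
  c(
    x = min(long, na.rm = TRUE) + frac_x * diff(range(long, na.rm = TRUE)),
    y = min(lat,  na.rm = TRUE) + frac_y * diff(range(lat,  na.rm = TRUE))
  )
}

# Assumed approximate bounds of the continental US
demo_long <- c(-125, -67)
demo_lat  <- c(25, 49)

# 90% of the way east and 5% of the way north of the lower-left corner
anchor <- place_anchor(demo_long, demo_lat, frac_x = 0.9, frac_y = 0.05)
print(anchor)
```

With the merged data, place_anchor(states$long, states$lat, 0.9, 0.05) could be passed to the anchor argument; na.rm = TRUE is included because rows without coordinates may be present after a merge with all = TRUE.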

    Now you need to add a scale bar. Many of the arguments used for the north arrow are duplicated for scalebar:

    • dataset
    • location, indicating where to base the location on the map
    • anchor, coordinates for the symbol position on the map (based off the location)
    • distance for each unit of the scale bar
    • unit of measurement such as mi, km, etc.
    • transform (TRUE/FALSE), assumes the coordinates are in decimal degrees
    • model, choice of ellipsoid; which will be discussed later in the semester
    • st.size, scale bar size
    • st.dist, distance between the scale bar and the scale bar’s text, as a proportion of the y axis

    So your new script will look like:

    ggplot(states) + 
      geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
      scale_fill_viridis_c(option = "D", trans = "log10") + 
      labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
      north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25)) + 
      scalebar(states, dist = 250, dist_unit = "mi", transform = TRUE, model = "WGS84", location = "bottomleft", st.dist = 0.05, st.size = 2, anchor = c(x=-125,y=27))

    As with all of the other customizations above, feel free to adjust the units, distance, text distance, and size based on your own style.

    Finally, you will need to add text to the map to indicate the name of the person who created the map and the date. In the future you may also include references or other text-based information. You will explore a number of different ways to add text to your maps, such as caption = in labs(), but for this example you will use annotate(). Similar to the north arrow and scale bar, annotate() takes several arguments:

    • x and y argument to set the location
    • size to indicate the font size
    • label for the text you wish to include; to move text to a new line, insert "\n" where you want the line break
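
    The label text can also be assembled before it is passed to annotate(). The following is a minimal base R sketch; the object names map_author, map_date, and map_label are hypothetical and the values are placeholders:

    ```r
    # Placeholder values for illustration only
    map_author <- "Your Name"
    map_date <- format(Sys.Date(), "%B %d, %Y")

    # "\n" places a line break between the name and the date
    map_label <- paste(map_author, map_date, sep = "\n")

    # cat() prints the text with the line break applied
    cat(map_label, "\n")
    ```

    The resulting object could then be supplied as label = map_label inside annotate().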

    Your final script should now look similar to this:

    ggplot(states) + 
      geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
      scale_fill_viridis_c(option = "D", trans = "log10") + 
      labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
      north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25)) + 
      scalebar(states, dist = 250, dist_unit = "mi", transform = TRUE, model = "WGS84", location = "bottomleft", st.dist = 0.05, st.size = 2, anchor = c(x=-125,y=27)) +
      annotate("text", x = -90, y = 25, label = "Your Name \n The Date", size = 2)

    There is an extensive range of modifications that can be applied to the map above, but for now you have the minimum information required to create a map of similar kinds of data. In future exercises you will use various functions to customize the look of your maps.

    Question No. 3
    How does the dist =  argument in the scalebar function relate specifically to the distance of the scale bar on your map? How would changing the value alter the appearance?

    1.4 Step Four: Your Turn

    After a rash of severe weather over the past few years, the Montgomery County Emergency Management Agency has asked you to provide a map detailing the number of reported tornadoes in each Tennessee county over the past several decades. This information will be shared with neighboring counties in Middle Tennessee as part of a severe weather education campaign designed to inform communities about the risk of tornadoes in the region. The map should include all of the elements included on your previous map of electric vehicle charging stations, such as:

    • Title
    • Scale
    • North Arrow
    • Legend
    • Name/Date of Cartographer

    Software-specific directions can be found for each step below. Please submit your answers to the questions above as well as your final tornado map by the due date.

    View directions in ArcGIS Pro

    Insert Text Here


    View directions in QGIS

    Insert Text Here


    View directions in R

    This portion of the exercise is meant to reinforce the skills you learned in the earlier parts of the lab. The steps you will take to complete your final map are to:

    1. Create an object from the tornadoes dataset
    2. Obtain a dataset of Tennessee counties
    3. Determine which columns can be used to merge the datasets
    4. Map out the data using ggplot

    To begin the exercise you will need this URL to the comma delimited dataset:

    https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/tn_tornadoes.csv

    This data represents the number of reported tornadoes in each county from 1950-2020. As in step one of the exercise, you can use the <- operator to create a new object and the read.csv() function with the link to the dataset to import the data. Remember you can use head() or str() to examine the information.

    To obtain county information for the State of Tennessee you should use the following script:

    tn <- map_data('county', region = "tennessee")

    If you search for map_data in the ggplot2 documentation you will find an example of a script used to isolate information for the State of Iowa that can be adapted for any state in the dataset. Remember when working with scripts, Google is your friend! All it requires is asking the correct question to find some example code online that can help guide you. There are numerous possible answers to the same problem so don’t hesitate to try other methods.

    Using the example code from step two, you will need to merge the two datasets into a new object based on a common variable such as (hint, hint) county name.
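
    The idea of merging on a shared key can be tried on toy data first. This is a minimal sketch using base R merge(), which may differ from the function used in step two. Every object name, column name other than tornado_count, and value below is hypothetical and is not taken from the exercise datasets:

    ```r
    # Toy stand-in for county polygon points (hypothetical values)
    toy_counties <- data.frame(
      subregion = c("beta", "beta", "alpha", "alpha"),
      long = c(-86.5, -86.4, -88.0, -87.9),
      lat = c(35.5, 35.6, 36.0, 36.1),
      order = 1:4
    )

    # Toy stand-in for a table of counts, one row per county
    toy_counts <- data.frame(
      county = c("Alpha", "Beta"),
      tornado_count = c(12, 30)
    )

    # Key values must match exactly, so put both in the same case
    toy_counts$county <- tolower(toy_counts$county)

    # by.x and by.y name the key column in each data frame
    toy_merged <- merge(toy_counties, toy_counts, by.x = "subregion", by.y = "county")

    # merge() sorts rows by the key, so restore the original point order
    toy_merged <- toy_merged[order(toy_merged$order), ]
    toy_merged
    ```

    Each polygon point now carries the count for its county. Restoring the point order matters because polygons are drawn by connecting points in sequence.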

    Finally, by adapting the ggplot code in step four, you can map the information for tornado_count for each county in Tennessee. In order to limit the scope of your map to just the state you should add the following script to your modified ggplot code from above:

    coord_fixed(xlim = c(-90,-82), ylim = c(35, 37))

    The coord_fixed() function holds the ratio between the x and y axes constant, and its xlim and ylim arguments limit the axes to the specified values. The values are in the units of measure of the data. In this case you can see that the x axis is limited (xlim) from -90° to -82° longitude (90° W to 82° W) and the y axis is limited (ylim) from 35° to 37° north latitude. If you omit this script your map will likely have a “smashed” appearance. For creating maps in R it is generally advisable to set the x and y coordinates to ensure proper display of your data. Remember you will need to adjust the anchor points of your name and date text, north arrow, and scale bar to fit the current map view. Additionally, the title and legend text should reflect the information depicted on your map.
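
    The effect of coord_fixed() can be seen on a simple shape before applying it to the county map. The sketch below assumes ggplot2 is installed and uses a hypothetical rectangle (toy_box) that spans the same limits. It is an illustration only, not the exercise map:

    ```r
    library(ggplot2)

    # Hypothetical rectangle spanning the limits used above
    toy_box <- data.frame(
      long = c(-90, -82, -82, -90),
      lat = c(35, 35, 37, 37)
    )

    base_map <- ggplot(toy_box) +
      geom_polygon(aes(x = long, y = lat), fill = "grey80", color = "black")

    # Without coord_fixed(), the panel stretches to fill the plotting window
    base_map

    # With coord_fixed(), one unit of x is drawn the same length as one unit of y
    base_map + coord_fixed(xlim = c(-90, -82), ylim = c(35, 37))
    ```

    Comparing the two outputs shows how the fixed ratio keeps the rectangle four times wider than it is tall regardless of the window shape.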

    Question No. 4
    Which county had the highest number of reported tornadoes?
    Type subset(tn_tornadoes, tornado_count == max(tornado_count)) into a new code cell or use Google to search for a county map of Tennessee to determine county locations on your map.
    Hint: Replace tn_tornadoes in the code above with the object you created by merging the tornado and counties datasets.
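
    Because a merged map dataset has many rows per county (one per polygon point), subset() will return every matching row. A minimal sketch on hypothetical toy data shows one way to reduce the result to one row per county and to rank the counties; the object names, the subregion column, and all values here are made up:

    ```r
    # Toy merged data: repeated rows per county, as in a polygon dataset
    toy <- data.frame(
      subregion = c("alpha", "alpha", "beta", "beta", "gamma"),
      tornado_count = c(12, 12, 30, 30, 7)
    )

    # Every row belonging to the county with the highest count
    top <- subset(toy, tornado_count == max(tornado_count))
    unique(top$subregion)

    # One row per county, ranked from most to fewest
    county_totals <- unique(toy[, c("subregion", "tornado_count")])
    county_totals <- county_totals[order(county_totals$tornado_count, decreasing = TRUE), ]
    county_totals
    ```

    Passing the ranked table to head() would list only its first few rows.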

    2 The Write-Up

    The Montgomery County Emergency Management Agency has asked you to provide a map detailing the number of reported tornadoes in each Tennessee county over the past several decades. Based on the map you create above, complete a lab write-up that addresses the following questions:

    • Provide the names of the five (5) counties that recorded the most tornadoes during that time frame
    • Describe which regions of Tennessee had the fewest reported tornadoes
    • Inform MCEMA which metropolitan regions could be most impacted by future severe weather events

    When complete, send a link to your Colab Notebook via email.

    ---
title: "Exercise 2: Getting to know... <br><small>Geographic Information Systems 1 Lab</small></br>"
author: "GEOG 3150"
output:
  html_notebook:
    df_print: paged
    rows.print: 10
    theme: cosmo
    highlight: breezedark
    number_sections: yes
    toc: yes
    toc_float:
      collapsed: no
      smooth_scroll: yes
  pdf_document: default
  html_document:
    toc: yes
    df_print: paged
editor_options:
  chunk_output_type: inline
  mode: gfm
---

```{=html}
<style type="text/css">

h1.title {
  font-size: 40px;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}
h4.author { /* Header 4 - and the author and date headers use this too  */
  font-size: 20px;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}

.zoom {
  transition: transform .2s; /* Animation */
  margin: 0 auto;
}
.zoom img{
	width:100%;
	height:auto;	
}
.zoom:hover {
  transform: scale(2);
}

</style>
```

<hr></hr>

For this first exercise you will be examining how you can use various software for geospatial analysis and graphical display of spatial data. Because of the variety and availability of operating systems in use today, each of the exercises in this course can be completed in either **[ArcGIS Pro]{style="color:#ff4500"}**, **[QGIS]{style="color: #006400"}**, or **[R]{style="color: #6495ED"}**. Each of these programs has its pros and cons, but for simplicity here is a table that might suggest which software would be best for you:

| [ArcGIS Pro]{style="color:#ff4500"} | [QGIS]{style="color: #006400"} | [R]{style="color: #6495ED"} |
| --- | --- | --- |
| Available on <u>Windows PC Only</u> | Available for PC, Mac, or Linux (limited) | Available for PC, Mac, Linux, ChromeOS, iOS, or Android via a modern web browser |
| Available by subscription only | Open Source | Open Source |
| Graphical User Interface with built-in functionality for Python; add-ons available for statistical languages | Graphical User Interface with built-in functionality for Python; use of plug-ins for other languages and analyses | Scripting language, mostly used for statistical purposes; packages expand its use to other applications such as geospatial analyses |
| Widely used in academia, government, and industry jobs | Widely used in academia and industry | Predominantly used in academia, research, and specialized industry jobs |
| Help available at [ESRI.com](https://pro.arcgis.com/en/pro-app/latest/help/main/welcome-to-the-arcgis-pro-app-help.htm) and through various outlets via a Google search | Help available at [QGIS.org](https://docs.qgis.org/3.16/en/docs/user_manual/) and through sites like stackoverflow.com or other sites via Google search | Help available through individual package [documentation](https://cran.r-project.org/web/packages/available_packages_by_name.html) and sites like stackoverflow.com or other sites via Google search |

For the purposes of this course, you can complete the laboratory exercises in any of these software packages. There will be drop-down menus in every lab color-coded for the three packages. Ultimately, you will have access to all of this material during and after the course, and while the steps might differ between versions the outcomes will be the same. So if you choose to complete the course with one software package you can always go back and complete the work in another in your own time.

While all of the software will be available on the computers in McCord 210, if you choose to use **[ArcGIS Pro]{style="color:#ff4500"}** you will need to contact the APSU GIS Center with your student information to obtain a one-year license for your personal computer. **[QGIS]{style="color: #006400"}** is available for download using this **[link](https://qgis.org/en/site/forusers/download.html){style="color: #006400"}**. If you are completing the course in **[R]{style="color: #6495ED"}** you will not need to install any software; however, you will need a [Google account](https://accounts.google.com/signup/v2/webcreateaccount) and access to the [Chrome browser](https://www.google.com/chrome/). When using **[R]{style="color: #6495ED"}** you will be completing all of the lab exercises in the [Google Colaboratory](https://colab.research.google.com/) and will access everything you need from the exercise pages on _GitHub_.

More information for each software and the steps to complete this exercise can be found below. Be sure to follow the color-coded drop-down for the specific software you are using for the lab. 

<details>
<summary><big>View information for <b> [ArcGIS Pro]{style="color:#ff4500"} </b></big></summary>

<b>[ArcGIS Pro]{style="color:#ff4500"}</b> is a proprietary software from [ESRI](https://www.esri.com/en-us/home). Student access is available in McCord 210 and several other computer labs on campus. If you wish to obtain the software for use on your personal computer (Windows-based PC only) you will need to contact the [APSU GIS Center](https://www.apsugis.org/) at 601 N 2nd St, Clarksville, TN 37040, (931) 221-7500. Please have your APSU student ID and A# available for verification prior to obtaining a license. Please contact the GIS Center if you have any issues downloading or installing the software.

</details>
<hr></hr>

<details>
<summary><big>View information for <b> [QGIS]{style="color: #006400"} </b></big></summary>

If you will be using <b>[QGIS]{style="color: #006400"}</b> for the exercises you will need to have the software installed on your computer. **QGIS** is available for PCs, Mac, and Linux computers. You can find the appropriate download for your operating system here:

- [PC version, 3.16](https://qgis.org/downloads/QGIS-OSGeo4W-3.16.6-1-Setup-x86_64.exe)
- [macOS version, 3.16](https://qgis.org/downloads/macos/qgis-macos-ltr.dmg)
- [Linux version, 3.x](https://qgis.org/en/site/forusers/alldownloads.html#linux)

When installing on _macOS_ for the first time you may need to go to _System Preferences > Security and Privacy_ and on the _General_ tab "Allow apps downloaded from" _App Store and identified developers_ and click _Open Anyway_. Please contact your instructor if you have any issues downloading or installing the software.

</details>
<hr></hr>

<details><summary><big>View information for <b> [R]{style="color: #6495ED"} </b></big></summary>
You will be completing the <b>[R]{style="color: #6495ED"}</b> portion of each exercise using the _[Google Colaboratory Executable Environment](https://colab.research.google.com/)_. Colaboratory, or _"colab"_ for short, is based on the popular Jupyter Notebooks and allows you to write and execute **R** or _Python_ in your browser, with:

- Zero configuration required
- Free access to GPUs
- Easy sharing

Each _colab_ notebook is separated into code cells and text cells. A code cell is used to write and execute a script interactively. Each text cell uses simple [markdown](https://www.markdownguide.org/) syntax for creating plain text information. You can easily share the notebook like other Google Drive based documents by clicking the "Share" link in the upper right-hand portion of the window.

Before beginning each exercise using **R**, you will need to open the [Exercise Colab Notebook](https://githubtocolab.com/chrismgentry/GIS1-Exercise-2/blob/main/GIS1_EX2.ipynb) and alter the URL to view/edit the file in colab. To do this you will navigate to the exercise's GitHub page; in this case [https://github.com/chrismgentry/GIS1-Exercise-2](https://github.com/chrismgentry/GIS1-Exercise-2). In the list of folders and files you will find a file that will always be identified as _GIS1_EX_ followed by the exercise number and a _.ipynb_ file extension. So for this exercise it will be [GIS1_EX2.ipynb](https://github.com/chrismgentry/GIS1-Exercise-2/blob/main/GIS1_EX2.ipynb).

<p align="center"><img src= "Images/r-github-ipy-location.png" alt="Finding the Google Colab file on GitHub" style="width:100%"></p>
<br>
When you locate the file, click on the name to open the _ipynb_ notebook. In the resulting window, click on the URL and insert **tocolab** immediately after _github_ in the address. So the new URL for this exercise should now read:<br> ```https://githubtocolab.com/chrismgentry/GIS1-Exercise-2/blob/main/GIS1_EX2.ipynb``` 

<p align="center"><img src= "Images/r-github-tocolab.png" alt="Converting ipy file to colab" style="width:100%"></p>
<br>
Click enter/return and the file should now open inside of _Colab_. Your screen might have a slightly different appearance, but you should see the Exercise 2, GIS 1 header indicating you are in the correct notebook.

<p align="center"><img src= "Images/r-colab.png" alt="Converting ipy file to colab" style="width:100%"></p>
<br>
Each time you open these colab notebooks, or if you have not interacted with the notebook for an extended period of time, you will need to make sure the environment is connected and all of the sample scripts have been run. To do this go to _Runtime > Run all_ (or press Ctrl+F9) and run the notebook. This may take a moment to complete so be patient until the last code cell has been executed. You should see a green check mark with an allocation of RAM and Disk as shown by horizontal bars which will indicate you are connected. In each section of code there is a **Run cell** &nbsp;<img src= "Images/r-colab-run-code.png" alt="Run cell button" width="15" height="15"> button that will allow you to run the individual code cells. This button will have a rotating loading symbol <img src= "Images/r-colab-loading.png" alt="Loading symbol" width="15" height="15"> while the code cell is being executed. Once it is complete the box will return to the run state and there may be an output visible depending on the script. You can add your own text or code cells to a notebook simply by moving your cursor slowly over the notebook to reveal the code/text option bar or by going to _Insert > Code cell/Text cell_.

<p align="center"><img src= "Images/r-colab-code-text.png" alt="Adding Text or Code Blocks" style="width:100%"></p>
<br>
By clicking within each code or text cell you can edit the contents and adjust the properties using the various preset option controls.

<p align="center"><img src= "Images/r-colab-block-settings.png" alt="Edit block properties" style="width:100%"></p>
<br>
You should practice adding and removing cells, editing their content, and rearranging the order of the cells. This will help familiarize you with notebook tools and allow you to manipulate notebooks in these exercises and create your own notebooks in the future.

For each new exercise there will be a colab notebook available. Within the notebook will be sample code for you to run, code cells for you to complete as part of the exercise, and questions to be answered within labeled text cells. Detailed directions will be available on the individual exercise webpage. To receive a grade, complete the code and question sections of each exercise and share the notebook link.

</details>

# The Introduction

The Tennessee Valley Authority (TVA) and Tennessee Department of Environment and Conservation (TDEC) are considering a partnership to increase the number of electric vehicle charging stations across the state. With destinations such as Nashville, Memphis, Gatlinburg and the Great Smoky Mountains, Bristol Motor Speedway, Signal Mountain and the Tennessee Aquarium, tourism is a major part of the state's economy. So accommodating residents and visitors with electric vehicles as well as creating a network for those traveling the state's major interstates (I-65, I-40, I-24, I-75, etc.) is of utmost importance. 

TDEC says this network of charging stations will also promote electric vehicle growth by giving drivers confidence they will have easy access to refueling while away from home, eliminating so-called “range anxiety” that keeps many consumers from considering electric vehicles as viable transportation. TVA says electric vehicle adoption will spur jobs and economic investment in the region, keep refueling dollars in the local economy, reduce the region's largest source of carbon emissions, and save drivers and fleets money.

So TVA and TDEC have asked you to develop a map showing the number of public electric vehicle charging stations in the continental United States. This will help them understand what the nation's current infrastructure looks like and how Tennessee currently compares to neighboring states. It will also help TVA and TDEC determine approximately how many additional stations they should add. 

In this exercise you will:

-   Navigate software used for geospatial analysis
-   Examine attributes of various datasets
-   Learn how to graphically display spatial data
-   Create a cartographically sound product

Software-specific directions can be found for each step below. Please submit your answers to the questions and your final map by the due date.

## Step One: The Data

_Insert Text Here_

<details>
<summary><big>View Directions in <b> [ArcGIS Pro]{style="color:#ff4500"} </b></big></summary>

_Insert Text Here_

Question No. 1
<blockquote>
_Insert Text Here_
</blockquote>

</details>
<hr></hr>

<details>
<summary><big>View Directions in <b> [QGIS]{style="color:#006400"} </b></big></summary>

The purpose of this exercise is to help familiarize you with the [QGIS]{style="color:#006400"} program. With this software, you will have the ability to view, create, and edit geographic data, query spatial data, examine spatial relationships, and create publishable maps. In these directions I will post images from both PC (on the left) and macOS (on the right) for each step. While your set-up may differ slightly, the configurations and layout of the tools should be similar.

When you begin a new exercise it would be beneficial to create a new folder specifically for the data and project files associated with the lab. This will help you to organize your data and be able to quickly refer back to it in later exercises. For this exercise you will need to download the _Electric Vehicle Charging Station_ data from the [GitHub Repository](https://github.com/chrismgentry/GIS1-Exercise-2){style="color:#000000"} by clicking on the download button at <b>[this link](https://github.com/chrismgentry/GIS1-Exercise-2/blob/main/Data/Shapefiles/ev_data.zip){style="color:#006400"}</b> and saving it in your exercise folder. Once the download is complete you will need to unzip the file by using _Control+click_ on macOS or _right-click > Extract All_ on PC.

<p align="center"><img src= "Images/github-data-download.png" alt="Data Download from GitHub" style="width:100%"></p><br>

To start the QGIS program press the _Start Menu > QGIS_ or launch the application from the _Mac Launchpad_. It might be beneficial to create a shortcut on the desktop or add it to your dock for quicker access in the future. As the program loads, the opening screen will be extremely useful to you while working on future projects, however, for this exercise we will select **New Empty Project**.

<p align="center"><img src= "Images/qgis-temp.png" alt="QGIS Start Screen" style="width:100%"></p><br>

The screen depicted below is separated into two sections. To the left are the _Browser_ and _Layers_ panels, and on the right is the _Map Canvas_. Your version may vary slightly from the images. There are a number of ways to add data, but in this example we are going to use the browser window to connect to our ESRI Data folder.

<p align="center"><img src= "Images/qgis-proj.png" alt="QGIS Start Screen" style="width:100%"></p><br>

In the _Browser_ window on the left, navigate to the folder where you saved and unzipped the exercise data folder. Once you locate the folder in the browser, you can use control+click (macOS) or right-click (PC) to **Add as a favorite**. This will link it to the favorites drop-down in the browser window giving you quicker future access. To add the data, you will navigate to the **ev_data.shp** file and then click-and-drag it to the map canvas. You will be able to add all types of data in this same way. If there is a pop-up window that mentions transformations go ahead and click OK. While the look of the US in this dataset is less than ideal, the purpose of this exercise is to ensure you know how to display data and use basic tools for navigating the software. Transformations will be discussed further in a couple of weeks after a lecture on projections and coordinate systems. For now, leave it with option 1 and click OK.

<p align="center"><img src= "Images/qgis-transformations.png" alt="QGIS Start Screen" style="width:100%"></p><br>

Your screen should now look similar to the screens below (colors may vary). Notice that the longitude and latitude values at the bottom of the screen adjust as you move your cursor. 

<p align="center"><img src= "Images/qgis-states.png" alt="QGIS Map Canvas" style="width:100%"></p><br>

You can now begin to explore some of the basic tools for navigating the _map canvas_ such as:

<table style="width:100%;margin-left:auto;margin-right:auto;">
  <tr>
    <td><li>Zoom in and out with the <i>Fixed Zoom Tools</i></li></td>
    <td><img src= "Images/qgis-fixed-zoom.png" alt="Fixed Zoom Tools" width="40" height="20"></td>
  </tr>
  <tr>
    <td><li><i>Pan tool</i> to position the map on the screen</li></td>
    <td><img src= "Images/qgis-pan.png" alt="Pan Tool" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li><i>Map scale</i> where you can adjust the map scale (level of zoom) manually</li></td>
    <td><img src= "Images/qgis-map-scale.png" alt="Map Scale" width="80" height="20"></td>
  </tr>
  <tr>
    <td><li><i>Zoom full</i> button which will adjust the view to accommodate all datasets</li></td>
    <td><img src= "Images/qgis-full-extent.png" alt="Zoom Full" width="20" height="20"></td>
  </tr>
</table>
<br>
Use these tools to zoom in and out, reposition the map on the screen, and return to the current view using the _zoom full_ button.

Another useful tool is the **identify features** cursor which uses an icon with a lowercase _i_ in a [blue circle]{style="color:#6495ED"} <img src= "Images/qgis-identify.png" alt="Identify Features" width="20" height="20"> . This option extracts information for a selected feature from the attribute table of the selected layer.

<p align="center"><img src= "Images/qgis-tennessee.png" alt="Identify Cursor" style="width:100%"></p><br>

If you wanted to view the entire underlying dataset you could control-click (macOS) or right-click (PC) on the active layer and select **Open attribute table**. In a future exercise you will learn how to sort, query, and edit information within the attribute table. For now, examine the numerous variables available for each state. What is the column name and range of values for the electric vehicle charging stations? <br>
<small><i>Hint: This information will be needed in the next section.</i></small>

<p align="center"><img src= "Images/qgis-attributes.png" alt="Attribute Table" style="width:100%"></p><br>

<b><big>Question No. 1</big></b>
<blockquote>
Using the identify tool, what is the population of Tennessee in 2010? What percentage of the population is female?<br><br>
Using the attribute table, find the area (_SQMI_&nbsp;) of Colorado. How much larger is it than Wyoming?
</blockquote>

Type your answers in a Word document or record the answers on a sheet of paper. They will need to be submitted at the conclusion of this lab.
</details>
<hr></hr>

<details>
<summary><big>View Directions in <b> [R]{style="color:#6495ED"} </b></big></summary>

For each exercise in **R** you will need to load various packages that are used to complete analyses and graphical output. Generally these packages will be preloaded in the colab notebook; however, in subsequent labs you may need to install certain packages to complete the exercises. To install a package in **R** you use the following function where ("x") is the name of a specific package.

```
install.packages("x")
```
Once the package has been installed, it will need to be loaded using a similar function:

```
library("x")
```
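
Before installing anything, it can be useful to check which packages are already present. The following is a minimal base R sketch, assuming the three package names used in this exercise; the object names `needed`, `installed`, and `to_install` are hypothetical:

```r
# Packages named in this exercise's notebook
needed <- c("tidyverse", "maps", "ggsn")

# requireNamespace() returns TRUE when a package is installed, without attaching it
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of any packages that would still need install.packages()
to_install <- needed[!installed]
to_install
```

Any names left in `to_install` could then be passed to `install.packages()`.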
In the colab notebook for this exercise you will see where three packages _tidyverse, maps, and ggsn_ were installed and loaded. Now that you have the packages required for the exercise you will need to add the data. For this lab the data consists of state names, abbreviations, and the number of electric vehicle recharging stations. To avoid the need to download the data, you will use the ```read.csv()``` function and a URL to import the data. Using the ```head()``` function will allow you to view the first few lines of any dataset.

```{r data, echo=TRUE, message=FALSE, warning=FALSE}
evs <- read.csv('https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/ev_stations.csv')
head(evs)
```

In the script above you will see the use of &nbsp;**<-**. This is an operator used to create an object that can be used in later steps. If the script was written as ```read.csv('https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/ev_stations.csv')``` the dataset would have been read and immediately displayed on the screen. However, it would not have been available for subsequent analyses. In colab, you can create your own code block and test it out to see the results. Since you need this information for later, it is important to use &nbsp;**<-** to create an object out of the imported data. In this, and future exercises, you will see that operator used frequently to create objects for analysis. 
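
The difference between reading data with and without the assignment operator can be tested without a download. This minimal sketch reads a small made-up table from a text string; the column names and values are hypothetical and are not those of the exercise dataset:

```r
# A tiny made-up CSV held in a character string
toy_csv <- "state,stations\nTennessee,10\nKentucky,7"

# Without <- the data is read and printed, but nothing is stored
read.csv(text = toy_csv)

# With <- the data is stored in an object for later steps
toy <- read.csv(text = toy_csv)

# The stored object can now be examined or reused
head(toy)
str(toy)
```

After the second call, `toy` remains available to any later code cell, while the result of the first call is gone once it has been displayed.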

In order to create a map of electric vehicle charging stations per state you need to obtain information for the states and create a new object. The ```tidyverse``` package is a container for a number of individual packages including the _Grammar of Graphics_ or _ggplot2_ package. This package will frequently be used to display your data, but it can also retrieve geographic information for the US from the _maps_ package. You can obtain that information by using the ```map_data``` function. In a new code block, you can use ```?map_data``` to view help information on the function. Alternatively you can view the documentation for any package or function by searching the package or function name and CRAN _(Comprehensive R Archive Network)_. For this function you would search ```map_data cran``` and the first link is likely to be the [RDocumentation page](https://www.rdocumentation.org/packages/ggplot2/versions/3.3.3/topics/map_data) for the function. Within this documentation you will find the arguments available for the function and example scripts. So to create an object that contains information for the continental US you can use:

```{r states, echo=TRUE, message=FALSE, warning=FALSE}
us <- map_data('state')
```

Using the _Grammar of Graphics_ package, ```ggplot2```, you can create a visualization of this data. Here is that script:

```{r states map, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(us) + 
  geom_polygon(aes(x=long, y=lat, group=group), color = "white")
```
In this script you identified that you wanted to use ```ggplot``` to visualize the **us** object you created in the previous step. Next ```ggplot``` needs to know what sort of object to draw. This is done by using the ```geom_``` function followed by a type of geometry. They include point, line, polygon and other geometries such as histograms, boxplots, violin plots for statistics, or contours, rasters, and tiles for three dimensional data. So for this step you used ```geom_polygon```. Next, ```ggplot``` needs to know how the data should be displayed. If you create a new code block and type ```str(us)``` you can see the structure of the data and a few of the variables. You will notice the dataset contains _long_ (longitude), _lat_ (latitude), _group_, _order_, _region_ (the state names), and _subregion_. So in ```ggplot``` you provide a series of aesthetics using ```aes()``` to direct ```ggplot``` on how to display the data. In this case, ```x = long, y = lat```, and you need to tell it to group the points by the _group_ column, which identifies the points that belong to each polygon. If you leave out _group_ for this specific script, ```ggplot``` will not know which points belong to which polygon and your map will not appear correctly. ```color = "white"``` tells ```ggplot``` to use white borders for the individual states. What do you think would happen if you change the word **_color_** to **_fill_**&nbsp;? In the resulting image, ```ggplot``` used the information from your aesthetics to draw the polygons and automate labels for the x and y axes. In a later step you will learn how to customize those labels.
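
The role of the _group_ aesthetic is easier to see on a small made-up dataset. The sketch below assumes ```ggplot2``` is installed and uses two hypothetical squares (the `squares` object) rather than the **us** object:

```r
library(ggplot2)

# Two separate unit squares, four corner points each (made-up coordinates)
squares <- data.frame(
  long = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat = c(0, 0, 1, 1, 0, 0, 1, 1),
  group = rep(c(1, 2), each = 4)
)

# With group, each set of four points is drawn as its own polygon
with_group <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat, group = group), color = "white")

# Without group, all eight points are connected as a single polygon
without_group <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat), color = "white")

with_group
without_group
```

In the second plot the two squares are joined by stray edges, which is the same kind of distortion that appears on a map when _group_ is omitted.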

It may seem as if this is a difficult way to view the data. In other software with a graphical user interface (GUI) you would most likely click open, navigate to the folder containing the data, double-click on that dataset, and then it would appear on your screen. Essentially, every time you click "open" in a GUI, it is executing a specific set of scripts to 1.) open the navigation window, 2.) allow the selected file to be imported, and 3.) then display the information on your screen. What you did above in three lines of code was to tell **R** to 1.) create an object called _us_ and 2.) display it on the screen with some specific parameters. The benefit of completing this in **R** versus something with a GUI is, if you had three more similar datasets to view, you would simply change _us_ to another object and re-run the script. To repeat the process in a GUI you would need to repeat all of the steps from the beginning. This might not seem like much for three steps, but what if your visualization had twenty steps, as many do? In **R** you would still simply change the dataset and run the same script, but in other programs you would need to repeat the same twenty steps for each dataset. Additionally, if a colleague wanted to display some data in the same way, you can either copy and paste the code or type out each of the twenty steps with directions. Which of these seems to be more consistently repeatable? Once you begin to understand the syntax (order or arrangement of words and phrases to form proper scripts) you will be more easily able to interpret sample scripts and fix errors in your own code.
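
One way to make this reuse explicit is to wrap the plotting steps in a function. The following is a minimal sketch, assuming ```ggplot2``` is installed; `draw_outline()` is a hypothetical helper written for this illustration and the two toy datasets are made up:

```r
library(ggplot2)

# Hypothetical helper: draws any data frame that has long, lat, and group columns
draw_outline <- function(map_df, border = "white") {
  ggplot(map_df) +
    geom_polygon(aes(x = long, y = lat, group = group), color = border)
}

# Two made-up datasets with the same column names
toy_a <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1), group = 1)
toy_b <- data.frame(long = c(0, 2, 1), lat = c(0, 0, 2), group = 1)

# The same steps are repeated by changing only the dataset
plot_a <- draw_outline(toy_a)
plot_b <- draw_outline(toy_b)

plot_a
plot_b
```

The same call could be shared with a colleague, who would only need to supply a data frame with matching column names.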

<big><b>Question No. 1</b></big>
<blockquote>
You used ```ggplot(us) + geom_polygon(aes(x=long, y=lat, group=group), color = "white")``` to create the visualization in this step. What script would you use to make the same map but with _black_ borders and _blue_ states? Add a code cell below this one, type the script, and run it to view the output.
<small>Hint: color = ".....", fill = "....."</small>
</blockquote>

</details>

## Step Two: The Analysis

In this step you will organize and display the data in order to prepare it for the final visualization. 

<details>
<summary><big>View Directions in <b> [ArcGIS Pro]{style="color:#ff4500"} </b></big></summary>

_Insert Text Here_

</details>
<hr></hr>

<details>
<summary><big>View Directions in <b> [QGIS]{style="color:#006400"} </b></big></summary>

Now that you have the data displayed on the screen and understand how to access the underlying data, you need to customize the view so you can see the spatial distribution of electric vehicle charging stations in the US. To begin, control-click (macOS) or right-click (PC) on **ev_data** in your _Layers_ panel and click **Properties**.

<p align="center"><img src= "Images/qgis-rcproperties.png" alt="Right Click Properties" style="width:60%"></p><br>

In the resulting window you will need to go to the **Symbology** tab (<i>1.</i>) in the left-hand menu. In this window you can change the fill of the polygons and change their opacity (or level of transparency). You can also adjust the symbology of your dataset. Those options are available in a selection bar at the top of the window. For this dataset they are:

<table style="width:50%;margin-left:auto;margin-right:auto;">
  <tr>
    <td><li>No Symbol</li></td>
    <td><img src= "Images/qgis-no-symbols.png" alt="No Symbols" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>Single Symbol</li></td>
    <td><img src= "Images/qgis-single-symbol.png" alt="Single Symbol" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>Categorized</li></td>
    <td><img src= "Images/qgis-categorized-symbol.png" alt="Categorized Symbols" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>Graduated</li></td>
    <td><img src= "Images/qgis-graduated-symbol.png" alt="Graduated Symbols" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>Rule-based</li></td>
    <td><img src= "Images/qgis-rule-based-symbol.png" alt="Rule Based Symbols" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>Inverted Polygons</li></td>
    <td><img src= "Images/qgis-invpoly-symbol.png" alt="Inverted Polygon Symbols" width="20" height="20"></td>
  </tr>
  <tr>
    <td><li>2.5 D</li></td>
    <td><img src= "Images/qgis-25D-symbol.png" alt="2.5 D Symbols" width="20" height="20"></td>
  </tr>
</table>
<br>
For this specific data you will choose _Graduated_ (<i>2.</i>) since the data needs to be displayed by a range of numeric values. Next you will select the **ev_station** variable (<i>3.</i>) in the dataset. The drop-down for the _Color ramp_ option offers a number of color options. For this example select **Viridis** (<i>4.</i>). Do you recall the range of values for the **ev_station** data? Because the largest value is greater than 30,000 and the smallest value is around 100, you will need to set the _Mode_ to a _logarithmic scale_ (<i>5.</i>) to properly display the data while avoiding a bias toward the larger values. Change the _Classes_ to 6 (<i>6.</i>) and click **OK** (<i>7.</i>).

<p align="center"><img src= "Images/qgis-symbology-options.png" alt="Symbology Properties" style="width:100%"></p>
<p align="center"><small><b>Because the macOS and PC versions are identical only one image is shown.</b></small></p><br>

Your screen should now look similar to this:

<p align="center"><img src= "Images/qgis-viridis.png" alt="Graduated Data" style="width:100%"></p>

At this point you should save your work. Whether using macOS or PC, on the menu bar go to _Project > Save As..._ and save your project in the folder you created for this exercise.

<big><b>Question No. 2</b></big><br>
In this step you used a _Graduated_ symbology to visualize the data and organized the values _logarithmically_. There were several other options within the _Mode_ menu.

<table style="width:75%;margin-left:auto;margin-right:auto;">
  <tr>
    <td><li>Equal Count (Quantile)</li></td>
    <td><img src= "Images/qgis-equal-count.png" alt="Equal Count" width="140" height="20"></td>
  </tr>
  <tr>
    <td><li>Equal Interval</li></td>
    <td><img src= "Images/qgis-equal-interval.png" alt="Equal Interval" width="140" height="20"></td>
  </tr>
  <tr>
    <td><li>Logarithmic Scale</li></td>
    <td><img src= "Images/qgis-log-scale.png" alt="Logarithmic Scale" width="140" height="20"></td>
  </tr>
  <tr>
    <td><li>Natural Breaks (Jenks)</li></td>
    <td><img src= "Images/qgis-jenks.png" alt="Natural Breaks (Jenks)" width="140" height="20"></td>
  </tr>
  <tr>
    <td><li>Pretty Breaks</li></td>
    <td><img src= "Images/qgis-pretty-breaks.png" alt="Pretty Breaks" width="140" height="20"></td>
  </tr>
  <tr>
    <td><li>Standard Deviation</li></td>
    <td><img src= "Images/qgis-sd.png" alt="Standard Deviation" width="140" height="20"></td>
  </tr>
</table>

<blockquote>
Adjusting the _mode_ value, describe how the visualization changes with each of these different options.
</blockquote>
Record your answers and submit at the conclusion of the lab.
</details>
<hr></hr>

<details>
<summary><big>View Directions in <b> [R]{style="color:#6495ED"} </b></big></summary>

Now that you have datasets for electric vehicle charging stations (object = **evs**) and the continental US (object = **us**), you need to combine that data so the states can be color coded based on the number of charging stations per state. To do this you will use a function called ```merge``` from the base **R** functions, which combines the information from **evs** and **us** into a single dataset based on a common variable. To start, you will need to determine which variables are contained within each dataset. You have already seen how to examine datasets using both ```head()``` and ```str()``` in this exercise. Create a new code block and examine the structure of each dataset. You will see that each has a column for state name, but the columns are labeled differently. This is important information you will need to properly ```merge``` the datasets.

To do this you will first create an object (**<-**) with a new name, then with the ```merge``` function set the following arguments:

- x, which is the first dataset
- y, which is the second dataset
- by.x, identifies the column to use for the merge in x
- by.y, identifies the column to use for the merge in y
- all = TRUE, which tells the function to retain all rows from both datasets, even those without a match

So your final script will be:

```{r merged datasets, echo=TRUE, message=FALSE, warning=FALSE}
states <- merge(x = us, y = evs, by.x = "region", by.y = "state", all = TRUE)
head(states)
```

Now you will see the columns from **evs**, such as _evs_count_ and _abbreviation_, included alongside the columns from **us** in the new **states** dataset. This new dataset will be what you use to visualize the information in the next step.
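If the behavior of ```all = TRUE``` is hard to picture, the following sketch uses two tiny made-up tables (the **us_toy** and **evs_toy** objects and their values are hypothetical) to show what happens to rows that have no match. It also re-sorts the result by the _order_ column, because ```merge``` can return rows in a different order than the input and ```geom_polygon``` connects points in row order; whether that re-sort is needed for your own data is something to check rather than assume.

```r
# Hypothetical stand-ins for the us and evs datasets
us_toy <- data.frame(
  region = c("ohio", "ohio", "utah", "alabama"),
  order  = 1:4
)
evs_toy <- data.frame(
  state     = c("alabama", "ohio", "texas"),
  evs_count = c(10, 20, 30)
)

# all = TRUE keeps unmatched rows from both tables and fills the gaps with NA
merged_all <- merge(x = us_toy, y = evs_toy, by.x = "region", by.y = "state", all = TRUE)

# Leaving out all = TRUE keeps only the rows that match in both tables
merged_matched <- merge(x = us_toy, y = evs_toy, by.x = "region", by.y = "state")

# Restore the original drawing order after the merge
merged_all <- merged_all[order(merged_all$order), ]

merged_all
merged_matched
```

In the first result, _utah_ has no station count and _texas_ has no map information, so both contain ```NA``` values. In the second result those two rows are dropped entirely.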

<big><b>Question No. 2</b></big>
<blockquote>
Using ```summary(states)``` in a new code cell, what are the largest and smallest numbers of electric vehicle charging stations?
</blockquote>

</details>


## Step Three: The Visualization

You will learn how to create a graphical display of your data that includes cartographic elements such as a legend, scale bar, north arrow, etc.

<details><summary><big>View directions in <b> [ArcGIS Pro]{style="color:#ff4500"} </b></big></summary>

_Insert Text Here_

</details>
<hr></hr>

<details><summary><big>View directions in <b> [QGIS]{style="color:#006400"} </b></big></summary>

Now it's time to turn your data into a map. From the menu bar in either macOS or PC click _Project > New Print Layout_. This will open a new window where you will add the data, title, legend, north arrow, scale bar, and your name and date.

<p align="center"><div class="zoom"> <img src= "Images/qgis-new-print-layer.png" alt="New Print Layout" style="width:100%"></div></p>

</details>
<hr></hr>

<details><summary><big>View directions in <b> [R]{style="color:#6495ED"} </b></big></summary>

With this new dataset you are now ready to create a map to examine the distribution of electric vehicle charging stations across the country. In step one you used a very simple script to display the **us** data.

```
ggplot(us) + 
  geom_polygon(aes(x=long, y=lat, group=group), color = "white")
```

A similar script would allow you to quickly visualize the data: ```ggplot(states) + geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white")```. However, to create a map you need to add a number of elements such as a scale bar, north arrow, title, etc. Additionally, you can customize other components to provide a better overall visualization.

Earlier in the notebook you installed the ```ggsn``` package. This package allows you to add "north symbols and scale bars for maps created with 'ggplot2' or 'ggmap'." So you can build on the script above to create a map of the information in the **states** dataset. To begin, run the script above with the added **fill** argument to see the outcome:

```{r basic evs map, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white")
```

One thing you will notice is that the categories are very difficult to distinguish. Because the values in the _evs_count_ column span such a wide range, only the largest value stands out. A simple fix is to take the common logarithm (base 10) of the data, which reduces the skew toward large values. The ```scale_fill_viridis_c``` function applies a color map designed to be perceived by viewers with common forms of color blindness and can transform the data at the same time. So you can add ```scale_fill_viridis_c(option = "D", trans = "log10")``` to the script above where:

- scale_fill_viridis_c is a fill pattern for continuous data
- option = "D" is the default color option
  - Color options available with this function include
    - A = magma
    - B = inferno
    - C = plasma
    - D = viridis
    - E = cividis
- trans = "log10" transforms the data using common logarithm, other options are available; see documentation

The new script should now look like this:

```{r log evs map, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + 
  geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
  scale_fill_viridis_c(option = "D", trans = "log10")
```
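To see why the logarithm helps, the sketch below compares where a few made-up station counts would fall on a 0 to 1 color scale with and without the transformation. The values and the ```rescale01``` helper are hypothetical and only serve to illustrate the idea; the actual scaling inside **ggplot2** is handled for you.

```r
# Hypothetical station counts spanning a wide range
station_counts <- c(100, 500, 2500, 30000)

# Hypothetical helper: rescale values so the smallest is 0 and the largest is 1
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Position on the color scale without and with the log10 transformation
linear_position <- rescale01(station_counts)
log_position    <- rescale01(log10(station_counts))

round(data.frame(station_counts, linear_position, log_position), 3)
```

On the linear scale the three smaller values are crowded near 0 and would receive nearly the same color, while on the log scale they are spread across the range.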

Now that you are able to visualize the separations in the data you can add additional information. You can start with customizing the labels, map title, and legend title. This can all be completed by adding a single line of code containing all of the text information for those items:

```{r log map w titles, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + 
  geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
  scale_fill_viridis_c(option = "D", trans = "log10") + 
  labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations")
```

Feel free to edit the label names and the color option in the script to provide your own customizations. Next you need to add a scale bar and north arrow. To view the available options for the north arrow type ```northSymbols()``` into a new code block. The numeric values below each symbol will be used in the script to identify the specific style you choose. Because the north arrow, ```north```, is specifically related to the map data you need to provide the following arguments:

- dataset
- symbol, identified by the numerical value from ```northSymbols()```
- location, indicating where to base the location on the map
- anchor, coordinates for the symbol position on the map (based off the location)
- scale, the symbol size as a proportion of the map size 

So your new script will look like:

```{r map titles north, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + 
  geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
  scale_fill_viridis_c(option = "D", trans = "log10") + 
  labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
  north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25))
```
In this example, ```location = "bottomleft"``` means the location of the _north arrow_ will be based from the bottom left of the symbol and ```anchor = c(x = -70, y = 25)``` is the geographic location on the map to draw the symbol. For example, if the _anchors_ were set at -100 and 40 the symbol would be drawn on the Nebraska/Kansas border. Feel free to adjust the anchor points to draw the north arrow in your preferred location.

Now you need to add a scale bar. Many of the arguments used for the north arrow are repeated for ```scalebar```:

- dataset
- location, indicating where to base the location on the map
- anchor, coordinates for the symbol position on the map (based off the location)
- dist, the distance represented by each segment of the scale bar
- dist_unit, the unit of measurement such as mi, km, etc.
- transform (TRUE/FALSE), if TRUE the coordinates are assumed to be in decimal degrees
- model, choice of ellipsoid; which will be discussed later in the semester
- st.size, scale bar size
- st.dist, distance between the scale bar and the scale bar’s text, as a proportion of the y axis

```{r map titles north scale, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + 
  geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
  scale_fill_viridis_c(option = "D", trans = "log10") + 
  labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
  north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25)) + 
  scalebar(states, dist = 250, dist_unit = "mi", transform = TRUE, model = "WGS84", location = "bottomleft", st.dist = 0.05, st.size = 2, anchor = c(x=-125,y=27))
```

As with all of the other customizations above, feel free to adjust the units, distance, text distance, and size based on your own style.

Finally, you will need to add text to the map to indicate the name of the person who created the map and the date. In the future you may also include references or other text-based information. You will explore a number of different ways to add text to your maps, such as ```caption = ``` in ```labs()```, but for this example you will use ```annotate()```. Similar to the north arrow and scale bar, it takes the following arguments:

- _x_ and _y_ argument to set the location
- size to indicate the font size
- label for the text you wish to include; to create a character return to move text to a new line you should use "\\n" where you want the text to move to a new line

Your final script should now look similar to this:

```{r final map, echo=TRUE, message=FALSE, warning=FALSE}
ggplot(states) + 
  geom_polygon(aes(x=long, y=lat, group=group, fill = evs_count), color = "white") +
  scale_fill_viridis_c(option = "D", trans = "log10") + 
  labs(x="Longitude",y="Latitude", title="Number of Electric Vehicle Charging Stations Per State", fill = "No. of Stations") +
  north(states, location = "bottomleft", scale = 0.05, symbol = 12, anchor = c(x= -70, y= 25)) + 
  scalebar(states, dist = 250, dist_unit = "mi", transform = TRUE, model = "WGS84", location = "bottomleft", st.dist = 0.05, st.size = 2, anchor = c(x=-125,y=27)) +
  annotate("text", x = -90, y = 25, label = "Your Name \n The Date", size = 2)
```

There is a nearly endless number of modifications that can be applied to the map above, but for now you have the minimum information required to create a map of similar kinds of data. In future exercises you will use various functions to customize the look of your maps.
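As one example of such a modification, the ```caption = ``` option in ```labs()``` mentioned earlier in this step can hold the name and date instead of ```annotate()```. The sketch below uses a made-up single square (the **toy** object is hypothetical) so it can run on its own, and assumes **ggplot2** is loaded. The caption is placed below the plot automatically, so no _x_ and _y_ location is required.

```r
library(ggplot2)

# Hypothetical toy data: one square standing in for the map
toy <- data.frame(
  long  = c(0, 1, 1, 0),
  lat   = c(0, 0, 1, 1),
  group = 1
)

# "\n" inside the text moves "The Date" to a new line
caption_map <- ggplot(toy) +
  geom_polygon(aes(x = long, y = lat, group = group), color = "white") +
  labs(title = "Example Title", caption = "Your Name\nThe Date")

caption_map
```

The trade-off is placement: a caption always sits outside the plotting area, while ```annotate()``` lets you position the text anywhere on the map itself.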

<b><big>Question No. 3</big></b>
<blockquote>
How does the _dist = _ &nbsp;argument in the ```scalebar``` function relate specifically to the distance of the scale bar on your map? How would changing the value alter the appearance?
</blockquote>

</details>

## Step Four: Your Turn

After a rash of severe weather over the past few years, the Montgomery County Emergency Management Agency has asked you to provide a map detailing the number of reported tornadoes in each Tennessee county over the past several decades. This information will be shared with neighboring counties in middle Tennessee as part of a severe weather education campaign designed to inform communities about the risk of tornadoes in the region. The map should include all of the elements included on your previous map of electric vehicle charging stations such as:

- Title
- Scale
- North Arrow
- Legend
- Name/Date of Cartographer

Software-specific directions can be found for each step below. Please submit the answers to the questions above as well as your final tornado map by the due date.

<details><summary><big>View directions in <b> [ArcGIS Pro]{style="color:#ff4500"} </b></big></summary>

_Insert Text Here_

</details>
<hr></hr>

<details><summary><big>View directions in <b> [QGIS]{style="color:#006400"} </b></big></summary>

_Insert Text Here_

</details>
<hr></hr>

<details><summary><big>View directions in <b> [R]{style="color:#6495ED"} </b></big></summary>

This portion of the exercise is meant to reinforce the skills you learned in the earlier parts of the lab. The steps you will take to complete your final map will be to:

1. Create an object from the tornadoes dataset
2. Obtain a dataset of Tennessee counties
3. Determine which columns can be used to ```merge``` the datasets
4. Map out the data using ```ggplot```


To begin the exercise you will need this URL to the comma delimited dataset:

https://raw.githubusercontent.com/chrismgentry/GIS1-Exercise-2/main/Data/tn_tornadoes.csv

This data represents the number of reported tornadoes in each county from 1950-2020. As in step one of the exercise, you can use the **<-** operator to create a new object and the ```read.csv()``` function with the link to the dataset to import the data. Remember you can use ```head()``` or ```str()``` to examine the information.

To obtain county information for the State of Tennessee you should use the following script:

```{r tennessee, echo = TRUE, message=FALSE, warning=FALSE}
tn <- map_data('county', region = "tennessee")
```

If you search for _map_data_ in the **ggplot2** [documentation](https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf) you will find an example of a script used to isolate information for the State of Iowa that can be adapted for any state in the dataset. Remember when working with scripts, Google is your friend! All it requires is asking the correct question to find some example code online that can help guide you. There are numerous possible answers to the same problem so don't hesitate to try other methods.

Using the example code from step two, you will need to ```merge``` the two datasets into a new object based on a common variable such as <small><i>Hint...hint</i></small> county name.

Finally, by adapting the ```ggplot``` code in step three, you can map the information for _tornado_count_ for each county in Tennessee. In order to limit the scope of your map to just the state you should add the following script to your modified ```ggplot``` code from above:

```
coord_fixed(xlim = c(-90,-82), ylim = c(35, 37))
```

The ```coord_fixed()``` function fixes the aspect ratio between the axes, and its _xlim_ and _ylim_ arguments limit the axes to the specified values. The values are based on the units of measure of the data. In this case you can see that the **x axis** is limited (_xlim_) from 90&deg; to 82&deg; west longitude and the **y axis** is limited (_ylim_) from 35&deg; to 37&deg; north latitude. If you omit this script your map will likely have a "smashed" appearance. For creating maps in **R** it is generally advisable to set the _x_ and _y_ coordinates to ensure proper display of your data. Remember you will need to adjust the anchor points of your _name and date_ text, _north arrow_, and _scale bar_ to fit the current map view. Additionally, the _title_ and _legend text_ should reflect the information depicted on your map.
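The sketch below shows ```coord_fixed()``` on a made-up rectangle covering the same extent (the **box** object is hypothetical and stands in for the county data), assuming **ggplot2** is loaded. It also shows the optional ```ratio``` argument: by default one degree on the y axis is drawn the same length as one degree on the x axis, and a commonly used approximation for latitude/longitude data is to set the ratio to 1 divided by the cosine of the map's middle latitude. Treat that adjustment as an optional refinement rather than a requirement of this exercise.

```r
library(ggplot2)

# Hypothetical rectangle covering the approximate extent of the map
box <- data.frame(
  long  = c(-90, -82, -82, -90),
  lat   = c(35, 35, 37, 37),
  group = 1
)

base_plot <- ggplot(box) +
  geom_polygon(aes(x = long, y = lat, group = group), fill = "grey70", color = "white")

# Default ratio of 1, with the axes limited to the area of interest
fixed_plot <- base_plot +
  coord_fixed(xlim = c(-90, -82), ylim = c(35, 37))

# Optional: stretch the y axis to approximate true proportions near 36 degrees north
lat_ratio <- 1 / cos(36 * pi / 180)
adjusted_plot <- base_plot +
  coord_fixed(ratio = lat_ratio, xlim = c(-90, -82), ylim = c(35, 37))

fixed_plot
adjusted_plot
```

Comparing the two plots shows how the ratio changes the height of the map while the axis limits stay the same.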

<big><b>Question No. 4</b></big>
<blockquote>
Which county had the highest number of reported tornadoes?<br>
Type ```subset(tn_tornadoes, tornado_count == max(tornado_count))``` into a new code cell or use Google to search for a county map of Tennessee to determine county locations on your map.<br> 
<small>Hint: Replace **tn_tornadoes** in the code above with the object you created by merging the tornado and counties datasets.</small>
</blockquote>

</details>

# The Write-Up
The Montgomery County Emergency Management Agency has asked you to provide a map detailing the number of reported tornadoes in each Tennessee county over the past several decades. Based on the map you created above, complete a lab write-up that addresses the following questions:

- Provide the names of the five (5) counties that recorded the most tornadoes during that time frame
- Describe which regions of Tennessee had the fewest reported tornadoes 
- Inform MCEMA which metropolitan regions could be most impacted by future severe weather events

When complete, send a link to your _Colab Notebook_ via email.
